Comments for MEDB 5502, Week 06

Topics to be covered

  • What you will learn
    • Test of two proportions
    • Chi-square test of independence
    • Odds ratio versus relative risk
    • Concepts behind the logistic regression model
    • Logistic regression with categorical variables
    • Logistic regression with interactions
    • To be determined
    • To be determined

Comparing two binary outcomes

  • Is there a difference in the proportion of deaths between male passengers and female passengers on the Titanic?
  • Is there a difference in the proportion of patients finishing the full three doses of HPV vaccine between Black women and White women?
  • Does using an NG tube for feeding in pre-term infants increase the probability of successful breast feeding at six months?

Other comparisons involving a binary outcome

  • Is there a difference in the proportion of deaths between first class, second class, and third class passengers?
  • Does age influence the proportion of women finishing the full three doses of HPV vaccine?
  • Controlling for the mother’s age, does using an NG tube for feeding in pre-term infants increase the probability of successful breast feeding at six months?

Hypothesis framework

  • \(H_0:\ \pi_1=\pi_2\)
  • \(H_1:\ \pi_1 \ne \pi_2\)
  • Compute \(\hat p_1\) and \(\hat p_2\) from samples
  • Accept \(H_0\) if \(\hat p_1-\hat p_2\) is close to zero.
    • \(T=(\hat p_1-\hat p_2)/s.e.\)
    • 95% CI: \((\hat p_1-\hat p_2) \pm Z_{\alpha/2}s.e.\)
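The framework above can be sketched in a few lines of Python. This is a minimal illustration, not the course's official code: the function name `two_proportion_test` is hypothetical, the test statistic is assumed to use the pooled standard error under \(H_0\), and the confidence interval is assumed to use the unpooled standard error. The counts are the male and female deaths from the Titanic table that appears later in these notes.

```python
import math

def two_proportion_test(x1, n1, x2, n2, z_crit=1.96):
    """Return (difference, T, two-sided p-value, 95% CI) for p1 - p2.

    Assumption: pooled s.e. for the test, unpooled s.e. for the interval.
    """
    p1 = x1 / n1
    p2 = x2 / n2
    diff = p1 - p2

    # Pooled proportion, valid when H0: pi_1 = pi_2 is true
    p_pool = (x1 + x2) / (n1 + n2)
    se_test = math.sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    t_stat = diff / se_test

    # Two-sided p-value from the standard normal distribution
    p_value = math.erfc(abs(t_stat) / math.sqrt(2))

    # Unpooled standard error for the confidence interval
    se_ci = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    ci = (diff - z_crit * se_ci, diff + z_crit * se_ci)
    return diff, t_stat, p_value, ci

# Deaths among 851 male and 462 female Titanic passengers
diff, t_stat, p_value, ci = two_proportion_test(709, 851, 154, 462)
print(f"difference = {diff:.4f}, T = {t_stat:.2f}, p = {p_value:.3g}")
print(f"95% CI: {ci[0]:.4f} to {ci[1]:.4f}")
```

A confidence interval that excludes zero leads to the same conclusion as a large value of \(T\).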

Data layout, 1 of 2

Data layout, 2 of 2

Confidence interval and test of hypothesis

Live demo, Test of two proportions

Break #1

  • What you have learned
    • Test of two proportions
  • What’s coming next
    • Chi-square test of independence

Chi-square test of independence, 1 of 2

  • Equivalent to test of two proportions
  • Lay out data in two by two table
\[\begin{matrix} & No\ event & Event \\ Treatment & O_{11} & O_{12}\\ Control & O_{21} & O_{22} \end{matrix}\]

Chi-square test of independence, 2 of 2

\[\begin{matrix} & No\ event & Event \\ Treatment & E_{11} = n_1 (1-\hat p_.) & E_{12}=n_1 \hat p_.\\ Control & E_{21} = n_2 (1-\hat p_.) & E_{22}=n_2 \hat p_. \end{matrix}\]
  • \(X^2=\sum_{i,j} \frac{(O_{ij}-E_{ij})^2}{E_{ij}}\)
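The expected counts and the \(X^2\) statistic can be computed directly from the observed table. The sketch below is illustrative only: the function name `chi_square_2x2` is hypothetical, and the p-value line assumes a two by two table (one degree of freedom). The counts are taken from the Titanic table later in these notes (rows: Female, Male; columns: Survived, Died).

```python
import math

def chi_square_2x2(table):
    """Pearson chi-square for a table given as a list of rows.

    Expected count = row total * column total / grand total,
    which is the same as n_i * p-hat in the layout above.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    x2 = 0.0
    expected = []
    for i, row in enumerate(table):
        expected_row = []
        for j, observed in enumerate(row):
            e = row_totals[i] * col_totals[j] / n
            expected_row.append(e)
            x2 += (observed - e) ** 2 / e
        expected.append(expected_row)
    return x2, expected

titanic = [[308, 154],   # Female: survived, died
           [142, 709]]   # Male:   survived, died
x2, expected = chi_square_2x2(titanic)

# Upper tail of a chi-square with 1 degree of freedom (2x2 table only)
p_value = math.erfc(math.sqrt(x2 / 2))
print(f"X^2 = {x2:.2f}, p = {p_value:.3g}")
```

For a two by two table, \(X^2\) equals the square of the two-proportion test statistic when that statistic uses the pooled standard error, which is why the two tests are equivalent.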

Example: Titanic survival by sex

  • Moderate or large sample size: Pearson Chi-Square
  • Small sample size: Fisher’s Exact test
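For small tables, Fisher's Exact test can be computed from the hypergeometric distribution. The sketch below is one common way to define the two-sided p-value (sum the probabilities of all tables that are no more likely than the observed one); other definitions exist. The function name and the small table of counts are hypothetical and chosen only for illustration.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's Exact p-value for the table [[a, b], [c, d]].

    Assumption: 'two-sided' means summing every table with fixed margins
    whose probability is no larger than that of the observed table.
    """
    r1, r2 = a + b, c + d      # row totals
    c1 = a + c                 # first column total
    n = r1 + r2
    denom = comb(n, c1)

    def prob(x):
        # Hypergeometric probability that the top-left cell equals x
        return comb(r1, x) * comb(r2, c1 - x) / denom

    p_obs = prob(a)
    lo = max(0, c1 - r2)
    hi = min(r1, c1)
    # Small tolerance guards against floating point ties
    return sum(prob(x) for x in range(lo, hi + 1)
               if prob(x) <= p_obs * (1 + 1e-9))

# Hypothetical small study: 3 of 4 events in one group, 1 of 4 in the other
p = fisher_exact_two_sided(3, 1, 1, 3)
print(f"Fisher's Exact two-sided p = {p:.4f}")
```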

Live demo, Chi-square test of independence

Break #2

  • What you have learned
    • Chi-square test of independence
  • What’s coming next
    • Odds ratio versus relative risk

Titanic data

       Survived   Died  Total
Female   308      154     462
Male     142      709     851
Total    450      863   1,313

Titanic data, odds of death

       Survived   Died  Total  Odds
Female   308      154     462  2     to 1 against
Male     142      709     851  4.993 to 1 in favor
Total    450      863   1,313

Odds ratio = 4.993 / 0.5 = 9.986

Titanic data, probability of death

       Survived   Died  Total  Probability
Female   308      154     462    0.3333
Male     142      709     851    0.8331
Total    450      863   1,313

Relative risk = 0.8331 / 0.3333 = 2.5
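The odds ratio and relative risk above can be reproduced from the four cell counts. This is a small sketch with hypothetical variable names; it uses only the counts in the Titanic table.

```python
# Counts from the Titanic table
died = {"Female": 154, "Male": 709}
survived = {"Female": 308, "Male": 142}

# Odds of death: deaths divided by survivors
odds = {sex: died[sex] / survived[sex] for sex in died}

# Probability of death: deaths divided by the row total
prob = {sex: died[sex] / (died[sex] + survived[sex]) for sex in died}

odds_ratio = odds["Male"] / odds["Female"]
relative_risk = prob["Male"] / prob["Female"]

print(f"odds of death: female {odds['Female']:.3f}, male {odds['Male']:.3f}")
print(f"odds ratio    = {odds_ratio:.3f}")
print(f"relative risk = {relative_risk:.3f}")
```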

Which is better

  • Relative risk is consistent with how most people think, but
    • Relative risk cannot always be computed
    • Relative risk has an ambiguity

Fractions are funny

  ---------------- ----------------
     0.8 (4/5)        1.25 (5/4)  
     0.75 (3/4)       1.33 (4/3)  
     0.67 (2/3)       1.50 (3/2)  
     0.50 (1/2)       2.00 (2/1)  
  ---------------- ----------------

Swapping the numerator and denominator

Interpretability, 1 of 3

  • Change from 25% probability to 50% probability
  • Change from 3 to 1 odds against to even odds
    • RR = 2, OR = 3

Interpretability, 2 of 3

  • Change from 25% probability to 75% probability
  • Change from 3 to 1 odds against to 3 to 1 odds in favor
    • RR = 3, OR = 9

Interpretability, 3 of 3

  • Change from 10% probability to 90% probability
  • Change from 9 to 1 odds against to 9 to 1 odds in favor
    • RR = 9, OR = 81
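The three interpretability examples can be checked with a short function. The name `rr_and_or` is hypothetical; the probabilities are the ones on the slides.

```python
def rr_and_or(p_from, p_to):
    """Relative risk and odds ratio for a change in probability."""
    relative_risk = p_to / p_from
    odds_from = p_from / (1 - p_from)
    odds_to = p_to / (1 - p_to)
    return relative_risk, odds_to / odds_from

for p_from, p_to in [(0.25, 0.50), (0.25, 0.75), (0.10, 0.90)]:
    rr, odds_ratio = rr_and_or(p_from, p_to)
    print(f"{p_from:.0%} to {p_to:.0%}: RR = {rr:.0f}, OR = {odds_ratio:.0f}")
```

The odds ratio grows much faster than the relative risk as the probabilities move away from zero, so treating an odds ratio as if it were a relative risk overstates the change.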

Designs that rule out the use of the relative risk, 1 of 2

  ------------- ------------------ -------------- -----------
                   Cancer cases       Controls       Total  
     Balding            72               82           154  
      Hairy             55               57           112  
      Total            129              139           268  
  ------------- ------------------ -------------- -----------
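In a case-control design like this one, the researcher chooses how many cases and controls to sample, so the proportion of cancer cases within each row does not estimate a risk. The odds ratio is still usable because it comes out the same whether the odds are computed down the columns or across the rows. The sketch below illustrates this with the four cell counts from the table; the variable names are hypothetical.

```python
# Cell counts from the case-control table
balding_cases, balding_controls = 72, 82
hairy_cases, hairy_controls = 55, 57

# Odds of being balding, compared between cases and controls.
# This is the direction a case-control study actually samples.
odds_balding_in_cases = balding_cases / hairy_cases
odds_balding_in_controls = balding_controls / hairy_controls
or_by_column = odds_balding_in_cases / odds_balding_in_controls

# Odds of being a case, compared between balding and hairy subjects.
# This is the direction of scientific interest.
odds_case_if_balding = balding_cases / balding_controls
odds_case_if_hairy = hairy_cases / hairy_controls
or_by_row = odds_case_if_balding / odds_case_if_hairy

print(f"odds ratio by column = {or_by_column:.3f}")
print(f"odds ratio by row    = {or_by_row:.3f}")
```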

Designs that rule out the use of the relative risk, 2 of 2

  ------------- ------------------- ------------------- -----------
                   Heart disease          Healthy          Total  
     Balding        127 (9.4%)         1,224 (90.6%)       1,351  
      Hairy         548 (6.7%)         7,611 (93.3%)       8,159  
      Total             675                8,835           9,510  
  ------------- ------------------- ------------------- -----------

Covariate adjustments

  -------------- --------------- ----------------- -----------
                    Children        No children       Total  
     Epilepsy       232 (40%)        354 (60%)         586  
     Control        79 (72%)         30 (28%)          109  
      Total            311              384            695  
  -------------- --------------- ----------------- -----------

Ambiguous and confusing situations

  • One hundred pound sack of potatoes
    • 99% water, 1% potato
    • Weighs 1 pound after completely drying
    • Instead dry until 2% potato
      • How much does it weigh then?
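
One way to work the puzzle, assuming that drying removes only water so the potato solids stay at 1 pound. If \(W\) is the weight of the sack when it is 2% potato, then

\[0.02 \times W = 1 \ \text{pound} \quad \Rightarrow \quad W = \frac{1}{0.02} = 50 \ \text{pounds}\]

Changing from 1% potato to 2% potato sounds small, but it cuts the weight of the sack in half. The same percentages look very different when viewed as water (99% versus 98%) than when viewed as potato (1% versus 2%).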

Example: physician recommendations

  -------------------- ---------------- ----------------- -----------
                           No cath            Cath           Total  
      Male patient        34 (9.4%)        326 (90.6%)        360  
     Female patient       55 (15.3%)       305 (84.7%)        360  
         Total                89               631            720  
  -------------------- ---------------- ----------------- -----------
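The table above gives quite different impressions depending on which measure is reported. The sketch below computes both the relative risk and the odds ratio of a catheterization recommendation, comparing female patients to male patients. The variable names are hypothetical; the counts come from the table.

```python
# Counts from the physician recommendation table
male_no_cath, male_cath = 34, 326
female_no_cath, female_cath = 55, 305

# Probability of a cath recommendation in each group
p_male = male_cath / (male_cath + male_no_cath)
p_female = female_cath / (female_cath + female_no_cath)
relative_risk = p_female / p_male

# Odds of a cath recommendation in each group
odds_male = male_cath / male_no_cath
odds_female = female_cath / female_no_cath
odds_ratio = odds_female / odds_male

print(f"relative risk (female vs male) = {relative_risk:.2f}")
print(f"odds ratio    (female vs male) = {odds_ratio:.2f}")
```

Because the probability of a recommendation is high in both groups, the odds ratio is much further from 1 than the relative risk.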

Example: breast feeding study

  --------------- ------------------ ---------------- -----------
                     Continued bf       Stopped bf       Total  
     Treatment        19 (37.3%)        32 (62.7%)        51  
      Control          5 (8.8%)         52 (91.2%)        57  
       Total              24                84            108  
  --------------- ------------------ ---------------- -----------
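The breast feeding table illustrates the ambiguity in the relative risk: the answer depends on whether the outcome is framed as continuing or as stopping. The sketch below computes the relative risk and the odds ratio both ways. The helper name `rr_and_or_from_counts` is hypothetical; the counts come from the table.

```python
def rr_and_or_from_counts(event_trt, other_trt, event_ctl, other_ctl):
    """Relative risk and odds ratio of the event, treatment vs control."""
    p_trt = event_trt / (event_trt + other_trt)
    p_ctl = event_ctl / (event_ctl + other_ctl)
    relative_risk = p_trt / p_ctl
    odds_ratio = (event_trt / other_trt) / (event_ctl / other_ctl)
    return relative_risk, odds_ratio

# Outcome framed as "continued breast feeding"
rr_continue, or_continue = rr_and_or_from_counts(19, 32, 5, 52)

# Outcome framed as "stopped breast feeding"
rr_stop, or_stop = rr_and_or_from_counts(32, 19, 52, 5)

print(f"continued: RR = {rr_continue:.2f}, OR = {or_continue:.2f}")
print(f"stopped:   RR = {rr_stop:.2f}, OR = {or_stop:.2f}")
print(f"1 / RR(continued) = {1 / rr_continue:.2f}")
print(f"1 / OR(continued) = {1 / or_continue:.2f}")
```

Reframing the outcome simply inverts the odds ratio, but the two relative risks are not reciprocals of each other.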

Break #3

  • What you have learned
    • Odds ratio versus relative risk
  • What’s coming next
    • Concepts behind the logistic regression model

What is logistic regression?

  • Binary outcome
  • Categorical or continuous predictors
  • Linear on the log odds scale

Why log odds?

  • Statistical model of surgery
    • Estimates probability of demise
    • First prediction: probability=1.2
  • Log odds prevent out of range predictions

A linear model for probability, 1 of 2

A linear model for probability, 2 of 2

A multiplicative model for probability

The relationship between odds and probability

  • odds = prob / (1-prob)
  • prob = odds / (1+odds)
    • \(0 \le\) prob \(\le 1\)
    • \(0 \le\) odds \(\le \infty\)
      • \(0 \le\) odds against \(\le 1\)
      • \(1 \le\) odds in favor \(\le \infty\)
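The two conversion formulas translate directly into code. This is a minimal sketch with hypothetical function names; the input checks reflect the ranges listed above.

```python
def prob_to_odds(prob):
    """odds = prob / (1 - prob); undefined (infinite) when prob is 1."""
    if not 0 <= prob < 1:
        raise ValueError("prob must be at least 0 and less than 1")
    return prob / (1 - prob)

def odds_to_prob(odds):
    """prob = odds / (1 + odds); always lands between 0 and 1."""
    if odds < 0:
        raise ValueError("odds cannot be negative")
    return odds / (1 + odds)

for prob in [0.10, 0.25, 0.50, 0.75, 0.90]:
    odds = prob_to_odds(prob)
    label = "in favor" if odds >= 1 else "against"
    print(f"prob = {prob:.2f}, odds = {odds:.3f} ({label})")
```

Any non-negative value of the odds converts back to a probability between 0 and 1, which is what keeps predictions on the odds scale from going out of range.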

A log odds model for probability, 1 of 4

A log odds model for probability, 2 of 4

A log odds model for probability, 3 of 4

A log odds model for probability, 4 of 4

An example of a log odds model with real data, 1 of 3

An example of a log odds model with real data, 2 of 3

An example of a log odds model with real data, 3 of 3

  • log odds = -16.72 + 0.577 \(\times\) 30 = 0.59
  • odds = exp(log odds) = 1.8
  • prob = odds / (1+odds) = 0.64
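The three steps above can be reproduced in Python. The intercept (-16.72), slope (0.577), and predictor value (30) are the numbers on the slide; the variable names are hypothetical.

```python
import math

intercept = -16.72
slope = 0.577
x = 30  # value of the predictor used on the slide

# Step 1: the model is linear on the log odds scale
log_odds = intercept + slope * x

# Step 2: exponentiate to get the odds
odds = math.exp(log_odds)

# Step 3: convert the odds to a probability
prob = odds / (1 + odds)

print(f"log odds = {log_odds:.2f}")
print(f"odds     = {odds:.1f}")
print(f"prob     = {prob:.2f}")
```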

Live demo, Concepts behind the logistic regression model

Break #4

  • What you have learned
    • Concepts behind the logistic regression model
  • What’s coming next
    • Logistic regression with categorical variables

Categorical variables in a logistic regression model, 1 of 3

  • 1st class odds: 129/193 = 0.67 or 193/129 = 1.5

Categorical variables in a logistic regression model, 2 of 3

Categorical variables in a logistic regression model, 3 of 3

Live demo, Logistic regression with categorical variables

Break #5

  • What you have learned
    • Logistic regression with categorical variables
  • What’s coming next
    • Logistic regression with interactions

Interactions in logistic regression

  • Odds ratios vary by a third factor
  • Interpretation is more tedious

Odds ratio for first class

Odds ratio for second class

Odds ratio for third class

Logistic regression with interaction

Live demo, Logistic regression with interactions

Break #6

  • What you have learned
    • Logistic regression with interactions
  • What’s coming next
    • To be determined

Slide 06-07

Live demo, To be determined

Break #7

  • What you have learned
    • To be determined
  • What’s coming next
    • To be determined

Slide 06-08

Live demo, To be determined

Summary

  • What you have learned
    • Test of two proportions
    • Chi-square test of independence
    • Odds ratio versus relative risk
    • Concepts behind the logistic regression model
    • Logistic regression with categorical variables
    • Logistic regression with interactions
    • To be determined
    • To be determined

Additional topics??